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Abstract 

Human mobility is investigated using a continuum approach that allows to calculate the probability to 
observe a trip to any arbitrary region, and the fluxes between any two regions. The considered description 
offers a general and unified framework, in which previously proposed mobility models like the gravity 
model, the intervening opportunities model, and the recently introduced radiation model are naturally 
resulting as special cases. A new form of radiation model is derived and its validity is investigated using 
observational data offered by commuting trips obtained from the United States census data set, and the 
mobility fluxes extracted from mobile phone data collected in a western European country. The new 
modeling paradigm offered by this description suggests that the complex topological features observed in 
large mobility and transportation networks may be the result of a simple stochastic process taking place 
on an inhomogeneous landscape. 

Introduction 

Human mobility in form of migration or commuting becomes increasingly important nowadays due to 
many obvious reasons [T[: (i) traveling becomes easier, quicker and more affordable; (ii) some borders 
(like the ones inside EU) are more transparent or even inexistent for travelers; (iii) the density and 
growth of the population and their gross national product presents large territorial inequalities, which 
naturally induces mobility; (iv) the main and successful employers concentrate their location in narrow 
geographic regions where living costs are high, hence even in developed countries the employees are forced 
to commute; (v) large cities grow with higher rates, optimizing their functional efficiency and creating 
the necessary intellectual and economic surplus for sustaining this growth [2] . This higher growth rate of 
the population can be achieved only by relocating the highly skilled work-force from smaller cities. Here 
we propose a unified continuum approach to explain the resulting mobility patterns. 
Understanding and modeling the general patterns of human mobility is a long-standing problem in soci- 
ology and human geography with obvious impact on business and the economy [3] . Research in this area 
got new perspectives, arousing the interest of physicists [3J[S] due to the availability of several accurate 
and large scale electronic data, which helps track the mobility fluxes [BHE] and check the hypotheses 
and results of different models. Traditionally mobility fluxes were described by models originating from 
physics. The best-known is the gravity model [BJH] that postulates fluxes in analogy with the Newton's 
law of gravitation, where the number of commuters between two locations is proportional to their popula- 
tions (i.e. the 'demographic mass') and decays with the square of the distance between them. Beside the 
well-known gravity model, several other models were used like the generalized potential model [101111] , 
the intervening opportunities model [12] or the random utility model [13] . Recently, a parameter- free 
radiation model has been proposed, leading to mobility patterns in good agreement with the empirical 



2 



observations [14] . The model was developed assuming a spatially discretizcd settlement structure, and 
consequently it operates with a discretized flux topology on the edges of a complete graph. Here we 
consider and test a continuum approach to this model operating with fluxes between any two regions, 
and show that several other mobility models can be derived within the same framework. This novel 
approach based on the continuum description offers a new modeling and data interpretation paradigm 
for understanding human mobility patterns. 



Results 

The modeling framework 

The radiation model |14j has been originally formulated to estimate commuting fluxes, i.e. the average 
number of commuters traveling per unit time between any two locations in a country. The key idea 
is that while the home-to-work trip is a daily process, it is determined by a one-time choice, i.e. the 
job selection. Therefore commuting fluxes reflect the human behavior in the choice of the employment. 
In real life many variables can affect the employment's choice, from personal aspirations to economic 
considerations, but for the sake of simplicity only the most influential variables are considered in the 
model: the salary a job pays (or more generally, the working conditions), and the distance between the 
job and home. The main idea behind the model is that an individual accepts the closest job with better 
pay: each individual travels to the nearest location where she/he can improve her/his current working 
conditions (benefits). With this assumption, the probability P > (z\a) that an individual with benefit z 
refuses the closest a offers is: 

P>(z|a)=p<(z) a (1) 

where a is the number of open positions in the area within a circle of radius r(a) centered in the origin 
location, and p<(z) = J Q dx p(x) is the cumulative distribution function of the benefits. Equation (JXJ) is 
equivalent with assuming that the rejection of a job offers with benefits less or equal to z are independent 
events. 

Making different assumptions and approximations on the benefit distribution p(z), one can obtain several 
formulas for the number of trips between locations. Below we present four examples: the original radiation 
model, the classic intervening opportunities (10) model |12| . a uniform selection model, and a novel 
radiation model with selection. 



The original radiation model. If we solve Eq. (jTJ) assuming that the benefit distribution p(z) is a 
continuous function, we recover the original radiation model's formula |14) . Indeed, we calculate the 
probability P > (a) of not accepting one of the closest a job offers by integrating Eq. ([T]) over the benefits: 

P>(o) = ^°°dz*|Mp > (z|a) (2) 

, dp<(z) , ,„ 1 . . 

dz-!^p<(z) a = ——. (3) 
az a + 1 



The intervening opportunities model. We can also show how the classical 10 model [12 |. fT5] can 

be included within the same framework as a degenerate case. Consider the situation in which the benefit 
distribution is singular, i.e. all jobs are exactly equivalent p(z) = 4-p<{z) = S(z* — z) and p<(z) = 
1 — 0(z* — z) (where O is the Hcavisidc function). In this case we have to specify the individual's 
behavior when s/he receives a job offer identical to her/his current one: this corresponds to setting a 
specific value to the step function at the discontinuity point, 0(0) = k. If k = 0, then the individual 
will travel to an infinite distance; while if k = 1, the individual accepts the job in the closest location. If 
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< k < 1, then the individual accepts each offer with probability k and refuses it with probability 1 — k. 
Applying Eq. © we obtain 

P>(a) = (1 -n) a = c- aL (4) 

where L = — hi(l — k) = K if K = 0. 

The uniform selection model. When a-I<l,a good approximation of Eq. ((4]) is P>{a) = 1 — aL, 
which corresponds to randomly select one of the available job opportunities, irrespective of the benefits 
and the distance. Generalizing this interpretation, we can dehne a model on a finite space containing N 
average job openings per unit time in which the accepted job is selected uniformly at random, and thus 
P>(a) = 1 - a/N. 



The radiation model with selection. Let us assume that the benefit distribution p(z) is continuous 
as in the original radiation model, whereas the probability to accept any offer is reduced by a factor 
(1 — A) with A € [0,1]. As a consequence, the probability that an individual with benefit z accepts 
an offer has to be replaced by a reduced value: p>{z) = (1 — X)p > (z) 7 Vz > 0. This process can be 
interpreted as a commuting population who is willing to accept better offers with probability (1 — A), or 
who is aware only of a fraction (1 — A) of the available job offers. This is equivalent to a combination of 
the radiation model and the intervening opportunities model described above (here 1 — A = k). In this 
case P > (z\a) = p<{z) a = [1 — p > (z)] a , and the probability to refuse the closest a offers is 

P>(a) = I™ dz^M[\ + (l-\) p< (z)] a = 
Jo dz 

1 dw[X + (1 - A)w] a = ~^J~J]~xj (5) 

Note that when A = we recover the original radiation model ©, while a A > causes a shift of the 
median of P>(a) towards higher values of a. In particular, for A, A' — > 1 the following approximation 

holds: P>(a, A') « P> (a , A^j , where we made explicit the dependence on A. The validity of this 

relationship can be verified by defining k = 1 — A and expanding around k ~ 0: 

, N 1-(1- K ') a+1 

P> a, 1 - «') = 77 t w 

>{ ' ' k'(o + 1) 

1 - [1 - K'(a + 1) + ^(a + l)a + 0(n' 3 )} _ K 'a 
K '{a + 1) _ ~ T" 

, k' n 1 - (1 - K )°(«7«)+i 

P>(o-,l-«) = 7 ;/\ , ^ ~ 

1 - [1 - K{a{n'/K) + 1) + ^(o(k'/k) + l)a(K'/K) + C(k 3 )] _ 
«(o(k'/«) + 1) 

l-y+0(K'«fl) (7) 

The difference is of the order 0(k' 2 a) — 0(k' n a) , thus |P>(a, 1 — «;') — P>(a— , 1 — k)| — > when k, k' -> 0. 
Note that Eq. (J7]) follows immediately from Eq. §§§ by substituting k! >->■ « and a H> an'/n. We can derive 
the dependence of the median on the rescaling of the parameter A: if with A = 0.9 the median is a defined 



0{n' z a) (6) 



and 
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by 0.5 = P>(a|0.9), with A' = 0.99 the median is ten times higher, i.e. -§^5, = 10 a. By varying the 
parameter A it is thus possible to adjust the median of the distribution P>(a), which is equivalent to set 
a characteristic length of the trips. 

These examples show the versatility of the radiation model's formalism, which can successfully provide 
an explanation to several probability distributions P> (a) observed empirically in different contexts |12|ll4j . 
The probability density, P(a), to accept one of the offers between a and a + da for a unit da value can 
be obtained from P>(a) by derivation. To be more specific, let us consider the original radiation model. 
From Eq.[3]we have P(a) = — ^P>(a) = 1/(1 + a) 2 . Let n(x) be the density of job offers at point (x,y) 
(in polar coordinate, (r, 6), we will use the same notation for the density n(r,0)). Then one gets the 
following expression for the number of job offers within a distance r from xo, a(r) = /| x _ Xo |< r dx n(x) = 

J Q r dr' J Q 27r d9 r' n(r',9) and da = rdr d6 n(r,9). Thus the probability to accept an offer within a 
region at distance between r and r + dr, V(r)dr, is given by 

^, x , n , dr da(r) r dr L * d6 n(r, 6) 

V(r)dr = P(a)da = TTPJ— = FT 2 8 

w w [1 + a(r)] 2 dr [1 + a(r)] 2 v ; 

This also suggests that 

P Xo (x)dxdy = ^ + a ^ 2 dxdy (9) 

is the probability to travel from the origin, Xo , to an area dxdy centered at the spatial point x. In general, 
P Xo (x) has the following simple expression for any model presented above: P Xo (x) = — P^[a(|x — xo|)] • 
n(x). From Eq. (JHJ we can derive the probability P Xo (D) of a trip from the origin to a generic region D 
(see Fig. [TJl) as 

p xo{ d) = / dxP X0 ( X) = r^-m-t^. do) 



D 



[l + a(r)] 2 n(r) 



where n(r) = r d6 n{r,9) is the radial job offers' density, and ho{r) = r fg*£\ d# n(r, 9) is the job 
offers' density in D at distance r from xo- If the radial job offers' density has small variations around its 
average between r\ and r2, i.e. ft_o(r) w r ^ J™ drho(r) = {fiD)r\ an d n(r) w (n)^ Vr <G [ri,r2], then 
we can derive a simple approximated formula for P Xo (D) 



m r n(r) J^drMO 
X ° l ] ~ L a*{rf £drn(r) 



r 2 da'(r) 



dr^ a M = (n) 

a*(r) 2 a*(r2) — o*(ri) a*(r 2 ) ■ a*(ri) 

where a* = 1 + a, and a(£>) = J D dxn(x) is the number of job offers in D. 

This equation is especially important because data are usually collected as fluxes in a discretized space, 
whose regions arc defined according to the local administrative subdivision (e.g. counties or munic- 
ipalities). P Xo (D) has a particularly simple expression if we consider the probability P(n,a) to ac- 
cept one of the n offers between a and a + n, corresponding to the ring in Fig. [TJd. This is given by 
P(n,a) = J^ +n dxP(x) = P > (a) - P>(a + n) = 1/(1 + a) - 1/(1 + a + n), which in the limit n -> tends 
to n P(a). If we only consider trips outside a circular region centered on the origin location and containing 
m job offers, then the probability P(n, a\m) to accept one of the n offers between a and a + n given that 
none of the closest m offers has been accepted, is P(n,a)/P > (m) = (i+a)[i+a+n) • Note that P(n,a\m) is 
the same probability of one trip derived in the original radiation model's discrete formulation |14| with 
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the only difference being that here we have 1 + m instead of m (a is equal to s + m) . 
It is important to observe that the equations derived for P{a) are correctly normalized when the to- 
tal number of job offers, A~ tot , is infinite and therefore finite-size corrections are required in real- world 
applications |16j . The normalized probability is P(a)/J\f, where the normalization constant is J\f = 

daP(a) = P>(0) - P>(iV tot ). The correction to P(a) is of the order O(l/N tot ), which in most 
cases is very small given that usually N tot ^> 1. This normalization scheme has a straightforward mech- 
anistic interpretation: it offers another try at job selection for individuals who during their first job 
search did not find any job offer with better benefit than their current one. Other kinds of normalization 
procedures that combine two of the models presented above are also possible. If , for example, we assume 
that the individuals who did not find a better job in their first try decide to select the offer with the 
highest benefit, even if it does not exceed their current one, (a mechanism corresponding to the random 
selection model) the normalized probability we obtain is P(a) + P > (N tot )a/N tot . Therefore, there are 
multiple ways to normalize the models, each capturing a different selection mechanism. This suggests 
that a systematic investigation of finite size effects could also help understand the mechanisms underlying 
job selection. 

Comparison with empirical data 

In Fig. [2] we apply the original parameter-free radiation model (Eq. [3]) and the one-parameter radiation 
model with selection (Eq. [5]) to commuting data among United States' counties. We show the agreement 
between the theoretical P(a) = P{a\m) ■ P > (m) distributions and the collapses predicted by the original 
radiation model, Fig. [2Jd, and the radiation model with selection, Fig. [2b. In Fig. [3] we compare the 
theoretical distributions P{a) of the original radiation model, the radiation model with selection, and the 
10 model, to the empirical distributions extracted from a mobile phone database of a western European 
country. For a description of the data sets and the analyses performed see the section Materials and 
Methods. 

An advantage of the proposed approach is that it is defined for a continuous spatial density of job offers, 
and its results are thus independent of any particular space subdivision in discrete locations. This feature 
solves some consistency issues present in other mobility models defined on a discretized space. Consider 
for example the gravity law [6l[9lll7] , the prevailing framework to predict population movement |18H20j , 
cargo shipping volume [21] , inter-city phone calls [22] , as well as bilateral trade flows between nations [23] . 
The gravity law's probability of one trip from an area with population m to an area with population 
n (assuming that population is proportional to the number of job offers) at distance r is obtained by 
fitting a formula like P(n,r\m) oc m a, nr f(r) to previous mobility data. As shown in [14) . the values of 
the best-fit parameters a and f3 are strongly dependent on the spatial subdivision considered, raising the 
problem of deciding which subdivision gives the correct results. 

Also, the continuous formalism developed here helps finding a solution to the issue concerning the ad- 
ditivity of the fluxes frequently encountered in discrete formulations. As an example, consider two 
adjacent areas, 1 and 2 with populations ri\ and n 2 respectively, at the same distance r from the ori- 
gin location. The gravity law predicts T(l) = Cm a n^ f(r) and T(2) = Cm a n^f(r) travelers to 1 
and 2 respectively. If we consider a different spatial subdivision, in which locations 1 and 2 are now 
grouped together forming a single location, 1 + 2, and we calculate the number of travelers we obtain 
T(l + 2) = Cm a { ni + n 2 ) p f{r) ^ T(l) + T(2) unless /3 = or /3 = 1. If the exponent /3 is different from 
one, the additivity requirement does not hold and the difference in the estimated trips can be considerably 
high. For example, if /3 = 0.5 and m = n 2 = 5000, then AT = T(l) + T(2) - T(l + 2) oc 141 - 100 = 41, 
i.e. a 41% relative difference. The additivity of the fluxes is a necessary property required to any mo- 
bility model in order to be self-consistent. We can easily verify that all models derivable from Eq. ([I]) 
have the additivity property. This is a consequence of the linearity of the integral in Eq. (fTOf . In 
fact, for every two regions D\ n D 2 = and D\ U D 2 = D\ +2 we have (T Xo (l + 2)) oc P Xo (Di +2 ) = 
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f D dx P Xo (x) + J D2 dxP Xo (x) = (T Xo (l)) + (T Xo (2)), for a generic P Xo (x) . We observe that it is possible 

to develop a continuum formalism for the gravity model that fulfils the additivity constraint by assuming 
that the probability to travel from location x to location x is P|™(x) = n(x ) Q n(x)"/(|x — x |). The 
average number of travelers from region O to region D is (Tq(D)) = J Q dxo ra(xo) J D dx P x ™(x) and 
because of the linearity of the integral on D the fluxes are additive. 

We can use the continuum approach to investigate the relationship between a region's population and 
the total number of travelers from that region outwards (i.e. the commuters whose destination is outside 
the region). It is often assumed that the number of commuters is proportional to the region's popula- 
tion. This is the case, for example, for the commuting fluxes measured by the US census 2000 [3]. 
We can check the validity of this assumption by writing the average number of commuters leaving 
a region O as (To{O c )) = J Q dx n(x )P Xo (O c ), where O c = R 2 \ O is the complement of O, and 
Px (O c ) = J Q a dxP Xo (x) is the probability for an individual in x to travel outside O (cf. Eq. [TOj). We 
can easily calculate (To{O c )) if we make the simplifying assumptions that the number of job offers in a 
region is proportional to the region's population (see the section Materials and Methods for details) , that 
the population density is uniform, i.e. n(x) = n, and O is a circle of radius R (see Fig. Then 



(T o (0 c )) = f dx n(x ) / dxP X0 (x) = (2vrn) / dr r 
JO J O c Jo 
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1 fR+ro / f>2 _ 2 _ 2 \ j-cc 

drV(r) cos' 1 2 + / drV(r) 

R-r \ 2r 0r J J R+ra 



(12) 



where V(r)dr is the probability to travel to a distance r + dr (cf. Eq. [8]). For the original radiation 
model V(r) = 2irnr / (1 + nirr 2 ) 2 , and Eq. (fT2"j) can be calculated exactly and has the following asymptotic 
limits: (T (O c ))(R) ~ R 2 n if R < nT 1 ! 2 , and (T {O c ))(R) - R^Hn if R > rT 1 ! 2 . The same asymptotic 
behaviour is obtained for the 10 model, with T(r) = ie" Ln7rr2 : (T (O c ))(R) - R 2 if R < (nP)" 1 / 2 , and 
(T (O c ))(P) ~ R if R > (nL)- 1 / 2 . For both models if the size of the region, R, is sufficiently small then 
the number of commuters, (To(O c ))(P), is proportional to the total population of the region. When R 
becomes larger than a characteristic size only the individuals living close to the boundary have a non-zero 
chance of travelling outside O. 

A further generalization of the model could take into account the fact that Euclidean distance is not 
appropriate in situations where geographical barriers exist and/or travel facilities are heterogencously 
distributed. In this case one introduces a metric tensor (x) and the square distance between neighboring 
positions at point x is (dr) 2 = Yl^j gij(x.)dxidxj with x\ = x and X2 = y. In this case Eq. ([9]) is rewritten 

as Px (x)da;ida;2 = pTj^xTjp V ' 9{ yL )°^ x i c ^ x 2i where \J <?(x) = dct(gij) is a local parameter of the model. 



Discussion 



The fundamental Eq. (JTJ represents a unified framework to model mobility and transportation patterns. 
In particular, we showed how the intervening opportunity model can be regarded as a degenerate 
case of the radiation model, corresponding to a situation in which the benefit differences arc not taken 
into account in the employment's choice. We also explained the advantages of a continuous approach to 
model mobility fluxes, we derived the appropriate discrctized expressions that guarantee the consistency 
of our predictions on any discrete spatial subdivision, verifying that the fluxes additivity requirement 
holds. 

Furthermore, our approach also provides an insight on the theoretical foundation of the most common 
types of gravity models. Indeed, when the space is homogeneous and the job's distribution is fractal, 
a(r) is independent of the point of origin, i.e. a(r) = pr dp where dp and p are the fractal dimension 
and an average density of job offers, respectively. Equation (fTTj) for the probability, P(D), to observe a 
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trip to a generic region D within distances n and r-i from the origin becomes (n = a(D) is the number 
of job offers in D) P(D) ss [P > (prf F ) - P>(pr2 F )] d „ n — ww- In particular, for the original radiation 

pr 2 -pry 

model, Eq. <j3j> , the average number of trips to a region D containing n job offers is T(D) oc ^ p2r2 n ri y F , 

whereas for the intervening opportunities model, Eq. T(D) oc — pr j ~ pri d . These two classes of 

deterrence functions f(r), power law and exponential, arc actually the two most used form of gravity 
models jT71[2"UI[2~7j . Moreover, our approach provides an interpretation to the gravity model's fitting 
parameters. First, the exponents a and j3 are both one when the benefits are spatially uncorrelatcd, 
i.e. the benefit distributions at the local (regional) and global (country) scales are the same. If a or /3 
differ from one it means that there are regions where job offerings with higher or lower benefits tend to 
concentrate. Second, the exponent of the power law is predicted to be two times the fractal dimension 
of the job offers, dp, whereas the exponential deterrence function should be substituted with a stretched 
exponential with shape parameter dp and a characteristic length of the order of (pL)~ 1 / dF . Thus, when 
the spatial displacement of the potential trip's destinations is a fractal, the radiation model's formalism 
offers a theoretical derivation of the gravity models from first principles. 

In conclusion, we have developed a general framework for unifying the theoretical foundation of a broad 
class of human mobility models. The used continuum approach allows for a consistent description of 
mobility fluxes between any delimited regions. The successful comparison with real mobility fluxes ex- 
tracted from two different data sources confirms that our approach not only provides a theoretically sound 
modeling framework, but also a good quantitative agreement with experimental data. This suggests that 
the decision process we assumed for the job selection also captures the basic decision mechanism related 
to the choice of the destinations for other activities (shopping, leisure, ...). On the other hand, our study 
suggests that the weighted network representing the mobility fluxes among geographic regions can be the 
result of a stochastic process consisting of many independent events. This approach is somehow comple- 
mentary to the theory of optimal transportation networks [25H30] that describes the patterns observed 
in different natural and artificial systems solely as the adaptation to a global optimization principle (e.g. 
leaf venations, river networks, power grids, road and airport networks). The modeling framework we 
propose provides also a plausible example of spontaneous bottom- up design of transportation networks. 
Indeed, we show how complex patterns can arise even in those systems lacking a global control on the 
network topology, or a long-term evolutionary selection mechanism of the optimal structure. 



Materials and Methods 

Analysis on the inter-county commuting trips extracted from United States' 
Census data 

The data on US commuting trips can be freely downloaded from 
http://www.census.gov/population/www/cen2000/commuting/index.html. 

The files were compiled from Census 2000 responses to the long-form (sample) questions on where in- 
dividuals worked, and provide all the work destinations for people who live in each county. The data 
contain information on 34,116,820 commuters in 3,141 counties. 

Demographic data containing the population and the geographic coordinates of the centroids of each 
county can be freely downloaded from https://www.census.gov/geo/www/gazetteer/places2k.html. 
Our goal is to use the US commuting data to calculate the empirical distribution P(a) = —j^P > {a) and 
compare it to the theoretical predictions of the original radiation model, Eq. ([3]), and the radiation model 
with selection, Eq. (0. 

We assume that the number of employment opportunities in every county, ajobs, is proportional to the 
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county's population, a pop , i.e. aj bs = a pop ■ c, where c < 1 is the ratio between the average number of 
job offers considered by an individual (i.e. the ones known and of potential interest) over the popula- 
tion. Under this assumption, if we calculate the probability P(a) using the population instead of the job 
openings the resulting distribution is simply rescaled as P(aj bs/c)/c. 

From the census data we obtain the fraction of individuals who live in county i with population to and 
work in county j that lies beyond a circle containing a population a as P(a) = Tij/m = (Tij/Ti)(Ti/m) — 
P(a\m) ■ P > (m), where TJy is the number of commuters from i to j, and Tj is the total number of com- 
muters from i to all other counties. It follows that upon rescaling with P > (m), all the P(a\m) should 
collapse on the theoretical distribution P(a). This is what we want to test in Fig. [2J 
First, we divide the commuting fluxes in deciles according to the population of the origin county, 
to. Then, for each set we calculate the distributions P(a\m) (Fig. [2k), and the rescaled distributions 
P(a) = P(a\m) ■ P > (to) with to equal to the mean origin population of the counties in each set, and using 
the P> (to) of Eq. © in Fig. [2b, and of Eq. © in Fig. [2b. The value of the parameter A = 0.999988 
has been obtained by maximizing the likelihood that the observed fluxes are an outcome of the model. 
The discrepancy observed at very high a (^$> 10 7 ) can be the result of boundary (finite-size) effects that 
become relevant at large populations, corresponding to long distances. Also, the fluctuations at very 
small a values are due to the resolution limit encountered when a « m. The parameter A is close to 1 
because in the comparison with data we consider populations instead of job offers and we assume that 
the two quantities are proportional, and consequently the fitting parameter we find is A pop = A? obs , which 
is always close to 1 irrespective of Aj b s given that c<l. 



Analysis on trips extracted from a mobile phone dataset. 

We use a set of anonymized billing records from a European mobile phone service provider [5ll31[l32]. 
The dataset contains the spatio-temporal information of the calls placed by ~ I0M anonymous users, 
specifying date, time and the cellular antenna (tower) that handled each call. Coupled with a dataset 
containing the locations (latitude and longitude) of cellular towers, we have the approximate location 
of the caller when placing the call. We analyze all call records collected during one day, and we define 
a trip when we observe two consecutive calls by the same user from two different towers. The type of 
mobility information obtained from the mobile phone data is radically different from that provided by 
the census data. In fact, the scope and method of the mobile phone data collection is complementary to 
the self-reported information of the census survey, and it offers the possibility to consider all trips, not 
only commuting (home-to-work) trips. Additionally, the mobility information that we extract from the 
mobile phone data is more detailed in both time and space. Indeed, we can observe trips of any duration, 
ranging from few minutes to several hours. In a similar manner, we can analyse trips on the much finer 
spatial resolution of cellular towers, whose average distance is ~ 1km, compared to the average size of 
counties, ~ 10km. We are therefore including in the current analysis many more trips, obtaining a more 
complete picture of individual mobility. 

In Figure [3] we use the trips obtained from the mobile phone data to provide a direct test of the models' 
fundamental prediction, i.e. the specific functional form of the trips distribution P(a). In the case of 
mobile phone data the trips' destinations are determined by the particular purpose of the users when they 
start the trip. Therefore, the variable a should now represent not only the number of job opportunities 
in a region, but rather the number of all possible venues that could be the destination of a trip, e.g. 
shopping centers, restaurants, schools, bars, etc. We therefore define the variable a(D), representing the 
number of possible points of interest in a circular region D centered at a given cell tower, as the total 
number of calls placed from the towers in D, assuming that a location's attractiveness is proportional to 
its call activity. We then calculate the empirical density distribution P(a)da, i.e. the fraction of trips 
to the towers between a and a + da, and we compare it to the various models' theoretical predictions 
P(a) = — ^P > (a), with P>(a) defined in Eqs. ©, (@]), and ([3]), and whose parameters, A = 0.99986 and 
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L = 0.00007, are obtained with least-squares fits. Moreover, we verified (plots not shown) that the result 
presented in Fig. [3] is stable with respect to other possible ways of defining a trip using the mobile phone 
data, e.g. between the two farthest locations visited by each user in 24 hours, or between the two most 
visited locations. 
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Figure 1. Definition of the variables used in the calculations, a) Notation used in Eq. [TUJ b) 

Configuration used to calculate the probability P(n\a,m) c) Configuration used in Eq. (fT2)l to calculate 
(To(O c )). 
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Figure 2. Testing the radiation model's theoretical predictions on commuting trips 
extracted from the US census dataset. a) We divide the commuting flows in deciles according to 
the population of the origin county, m, and for each set we calculate the distributions P(a\m). The 
values in the key indicate the mean origin population, m, of each decile. We use the population as a 
proxy to estimate the number of employment opportunities in every county, a, assuming in first 
approximation a linear relationship between population and job openings. b,c) The collapse of the 
distributions P(a) = P(a\m) ■ P>(m) on the theoretical curves Eqs. ([3]) and ([5]) predicted by the original 
radiation model and the radiation model with selection respectively. (See the section Materials and 
Methods for details). 
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a, number of calls 

Figure 3. Testing the mobility models on trips extracted from a mobile phone dataset. We 

analyze all call records collected during one day, and we define a trip when we observe two consecutive 
calls by the same user from two different towers. We define the variable a(D), representing the number 
of possible points of interest in a circular area D centered at a given cell tower, as the total number of 
calls placed from the towers in D, assuming that a location's attractiveness is proportional to its call 
activity. We then calculate the empirical distribution P(a)da, i.e. the fraction of trips to the towers 
between a and a + da (red circles), and we compare it to the various models' theoretical predictions 
P{a) = —fcP>(a), with P > (a) defined in Eqs. ©, ©, and ©, and whose parameters, A = 0.99986 and 
L = 0.00007, are obtained with least-squares fits (black lines). In the inset we show the plot in a log-log 
scale. (See the section Materials and Methods for details). 



